Abstract

Let $\hat\beta_0$ be an estimate of $\beta$ in the linear model, $Y_i = x_i'\beta + e_i$. Define the
residuals $Y_i - x_i'\hat\beta_0$, let $0 < \alpha < 1/2$, and let $\hat\beta_L$ be the least squares estimate of
$\beta$ calculated after removing the observations with the $[\alpha n]$ smallest and $[\alpha n]$
largest residuals. By use of an asymptotic expansion, the limit distribution of
$\hat\beta_L$ is found under certain regularity conditions. This distribution depends
heavily upon the choice of $\hat\beta_0$. We discuss several choices of $\hat\beta_0$, with special
attention to the contaminated normal model. If $\hat\beta_0$ is the median regression or
least squares estimator then $\hat\beta_L$ is rather inefficient at the normal model. If F is
symmetric, then a particularly convenient, robust choice is to let $\hat\beta_0$ equal the
average of the $\alpha$th and $(1-\alpha)$th regression quantiles (Koenker and Bassett,
Econometrica (1978)). Then $\hat\beta_L$ has a limit distribution analogous to the trimmed
mean in the location model, and the covariance matrix of $\hat\beta_L$ is easily estimated.
1. Introduction. This paper is concerned with the linear model
(1.1) $y = X\beta + z$,

where $y' = (y_1,\ldots,y_N)$, $X$ is an $N \times p$ matrix of known constants, $\beta' = (\beta_1,\ldots,\beta_p)$
is a vector of unknown parameters, and $z' = (z_1,\ldots,z_N)$ is a vector of i.i.d.
random variables with distribution function F. The least squares estimator of
$\beta$ is said to be non-robust because it possesses two serious disadvantages,
inefficiency  when  F  has  heavier  tails  than  the  Gaussian  distribution  and  high 
sensitivity  to  spurious  observations.  These  deficiencies  are  closely  related 
and  Huber  (1977,  p.  3)  states  that  "for  most  practical  purposes,  'distributional 
robust'  and  'outlier  resistant'  are  interchangeable".  In  the  location  model, 
three  classes  of  estimators  have  been  proposed  to  overcome  these  deficiencies: 

M,  L,  and  R  estimators;  see  Huber  (1977)  for  an  introduction.  Among  the 
L-estimates,  the  trimmed  mean  is  particularly  attractive  because  it  is  easy  to 
compute,  is  rather  efficient  under  a  variety  of  circumstances,  and  can  be  used  to 
form  confidence  intervals  (Gross  (1976)  and  Huber  (1970)).  Hogg  (1974)  favors 
trimmed  means  for  the  above  reasons,  and  because  they  can  serve  as  a  basis  for 
adaptive  estimators.  Stigler  (1977)  applied  robust  estimators  to  historical  data 
and  concluded  that  "the  10%  trimmed  mean  (the  smallest  nonzero  trimming  percentage 
included  in  the  study)  emerges  as  the  recommended  estimator".  It  is  therefore 
natural to seek a trimmed least squares estimator for the general linear model
that possesses these desirable properties of the trimmed mean.

For the linear model, Bickel (1973) has proposed a class of one-step
L-estimators depending on a preliminary estimate of $\beta$, but, while these have good
asymptotic efficiencies, they are computationally complex and are generally not
invariant to reparameterization.


Recently, Koenker and Bassett (1978) have extended the concept of quantiles
to the general linear model. They suggest the following trimmed least squares
estimator, $\hat\beta_{KB}(\alpha)$ ($= \hat\beta_{KB}$): define the $\alpha$th and $(1-\alpha)$th regression quantiles $\hat\beta(\alpha)$
and $\hat\beta(1-\alpha)$ (see their paper for a definition of regression quantile), remove from
the sample any observation whose residual from $\hat\beta(\alpha)$ is negative or whose residual
from $\hat\beta(1-\alpha)$ is positive, and calculate the least squares estimator using the
remaining observations. In the location model, this estimator reduces to the
$\alpha$-trimmed mean. Ruppert and Carroll (1978) studied the large sample behavior of
$\hat\beta_{KB}$ (p fixed and $N \to \infty$) and found that the variance of $N^{1/2}\hat\beta_{KB}(\alpha)$ is
approximately $\sigma^2(\alpha,F)\,(N^{-1}X'X)^{-1}$, where in the location model $\sigma^2(\alpha,F)$ is the
asymptotic variance under F of the $\alpha$-trimmed mean (also normalized by $N^{1/2}$).

In this paper we investigate a class of estimators that present a third
possible method of defining a regression analogue of the trimmed mean. Specifically,
let $\hat\beta_0$ be a preliminary estimator. Form the residuals from $\hat\beta_0$ and remove
from the sample those observations corresponding to the $[N\alpha]$ smallest and $[N\alpha]$
largest residuals. Then the $\alpha$-trimmed least squares estimator,
$\hat\beta_L(\alpha)$ ($= \hat\beta_L$), is a least squares estimator using the remaining observations.
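The definition above can be sketched in a few lines of code. The following is only an illustration under stated assumptions, not the authors' implementation: the function names `solve_normal_equations` and `trimmed_least_squares` are hypothetical, the least squares fit is obtained from the normal equations by Gaussian elimination, and $[N\alpha]$ is taken to be the integer part of $N\alpha$.

```python
import math
import random


def solve_normal_equations(X, y):
    """Least squares fit from the normal equations X'X b = X'y (Gauss-Jordan)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    M = [[sum(row[j] * row[k] for row in X) for k in range(p)]
         + [sum(row[j] * yi for row, yi in zip(X, y))] for j in range(p)]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[j][p] / M[j][j] for j in range(p)]


def trimmed_least_squares(X, y, beta0, alpha):
    """Drop the [N*alpha] smallest and [N*alpha] largest residuals from beta0, then refit."""
    n = len(y)
    k = math.floor(n * alpha)
    residuals = [yi - sum(xj * bj for xj, bj in zip(row, beta0))
                 for row, yi in zip(X, y)]
    order = sorted(range(n), key=lambda i: residuals[i])
    keep = sorted(order[k:n - k])  # indices of the retained observations
    return solve_normal_equations([X[i] for i in keep], [y[i] for i in keep])


# Illustrative data (an assumption): intercept 2, slope 3, standard normal errors.
rng = random.Random(0)
X = [[1.0, rng.uniform(-1.0, 1.0)] for _ in range(200)]
y = [2.0 + 3.0 * row[1] + rng.gauss(0.0, 1.0) for row in X]
beta_ls = solve_normal_equations(X, y)  # preliminary estimate: least squares
beta_trim = trimmed_least_squares(X, y, beta_ls, alpha=0.1)
```

In the location model (a single column of ones) the ordering of the residuals is the ordering of the observations, so this sketch returns the $\alpha$-trimmed mean whatever preliminary estimate is supplied.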

The definition of $\hat\beta_L$ was motivated by the applied statisticians' practice of
examining the residuals from a least squares fit, removing the points with large
(absolute) residuals, and recalculating the least squares solution with the
remaining observations. Generally, there is no formal rule for deciding which points
to remove, but $\hat\beta_L$ is at least similar to this practice. Furthermore, the authors
do know of practitioners who have used $\hat\beta_L$.

Theorems 1 and 2, which are general results allowing a wide class of preliminary
estimates, give asymptotic representations for $\hat\beta_L$. These representations
enable one to calculate the asymptotic bias (which is 0 if F is symmetric and $\hat\beta_0$
is unbiased) and variance of $\hat\beta_L$.
When the preliminary estimate is the least squares or median regression ($L_1$)
estimate, two somewhat surprising conclusions emerge. First, for neither choice
is $\hat\beta_L$ a multivariate analogue to a trimmed mean. Second, either choice causes
$\hat\beta_L$ to be inefficient at the normal model, particularly when compared to the Koenker
and Bassett estimate or the M-estimates. For symmetric F, we show that the "right"
choice of a preliminary estimate is a regression analogue to averaging the $\alpha$th and
$(1-\alpha)$th sample quantiles.

Hogg (1974, p. 917) mentions that adaptive estimators can be constructed from
estimators similar or identical to $\hat\beta_L(\alpha)$ with $\alpha$ a function of the residuals from
$\hat\beta_0$. The advantage of this class of adaptive estimators, he feels, is that they
"would correspond more to the trimmed means for which we can find an error
structure". However, from the above results, we can conclude that even if $\alpha$ is
non-stochastic, estimators of the type suggested by Hogg will not necessarily have
error structures which correspond to the trimmed mean.

The methods of this paper can be applied to estimators similar to $\hat\beta_L$. For
example, let $\hat\beta_A(\alpha)$ ($= \hat\beta_A$) be the least squares estimate after the points with the
$[2\alpha N]$ largest absolute residuals from $\hat\beta_0$ are removed. In section 6 we state results
for $\hat\beta_A$. Their proofs are omitted, but are similar to the proofs of analogous results
for $\hat\beta_L$.

2. Notation and Assumptions. Although $y$, $X$ and $z$ in (1.1) depend upon N,
this will not be made explicit in the notation. Let $e' = (1,0,\ldots,0)$ ($1 \times p$) and let
$I_p$ be the $p \times p$ identity matrix. For $0 < p < 1$, define $\xi_p = F^{-1}(p)$.

Throughout, we will make the following three assumptions.

C1. $N^{1/2}(\hat\beta_0 - \beta) = O_p(1)$.

C2. Fix $0 < \alpha < 1/2$, and define $\xi_1 = \xi_\alpha$ and $\xi_2 = \xi_{1-\alpha}$.
Assume F has a continuous positive density f in neighborhoods of $\xi_1$ and $\xi_2$.

C3. Assume $x_{i1} = 1$ for $i = 1,\ldots,N$,

(2.1) $\lim_{N\to\infty} \left[ N^{-1/2} \max_{i \le N,\, j \le p} |x_{ij}| \right] = 0$,

(2.2) $\sum_{i=1}^{N} x_{ij} = 0$ for $j = 2,\ldots,p$,

i.e., the design is centered, and for Q some positive definite matrix

(2.3) $\lim_{N\to\infty} N^{-1} X'X = Q$.

Note that the probability distribution of $y$ is unchanged if we replace $\beta$ by
$\beta + \theta e$ and $F(\cdot)$ by $F(\cdot + \theta)$ where $\theta$ is any real number. Because of (2.2), many
possible preliminary estimates, $\hat\beta_0$, satisfy

$N^{1/2}(\hat\beta_0 - \beta - \theta e) = O_p(1)$

for some $\theta$. In particular, the LAD (least absolute deviation or median regression)
estimate has this property (Ruppert and Carroll (1978)). In this case, we can
reparameterize so that C1 holds.

The residuals from the preliminary estimate $\hat\beta_0$ are

(2.4) $r_i = y_i - x_i'\hat\beta_0 = z_i - x_i'(\hat\beta_0 - \beta)$.

Let $r_{1N}$ and $r_{2N}$ be the $[N\alpha]$th and $[N(1-\alpha)]$th ordered residuals, respectively. Then
the estimate $\hat\beta_L$ is a least squares (LS) estimate calculated after removing all
observations satisfying

(2.5) $r_i \le r_{1N}$ or $r_i \ge r_{2N}$.

Because of C2, asymptotic results are unaffected by requiring strict inequalities
in (2.5). Let $a_i = 0$ or 1 according as i satisfies (2.5) or not, and let A be the
$N \times N$ diagonal matrix with $A_{ii} = a_i$. Thus

$\hat\beta_L(\alpha) = (X'AX)^{-} X'Ay$,

where $(X'AX)^{-}$ is a generalized inverse for $X'AX$. (Later we show that
$N^{-1}(X'AX) \to_P (1-2\alpha)Q$, whence $P(X'AX \text{ is invertible}) \to 1$.)

3. Main Results. The analysis of the asymptotic behavior of $\hat\beta_L(\alpha)$ relies
heavily on techniques developed by Ruppert and Carroll (1978). The proofs are
sketched in the appendix. Lemma 1, which may be of some interest in itself, is
an asymptotic linearity result and is a generalization of work by Bahadur (1966)
and Ghosh (1971) for the location model.

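For $0 < \theta < 1$, define

(3.1) $\psi_\theta(x) = \theta - I(x < 0)$.

Lemma 1. For $\theta = \alpha$ or $(1-\alpha)$, let $r_{\theta N}$ be the $[N\theta]$th ordered residual. Then,

(3.2) $N^{1/2}(r_{\theta N} - \xi_\theta) = f(\xi_\theta)^{-1} \left[ N^{-1/2} \sum_{i=1}^{N} \psi_\theta(z_i - \xi_\theta) \right] - e' N^{1/2}(\hat\beta_0 - \beta) + o_p(1)$.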

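Theorem 1. Define $a = \xi_2 f(\xi_2) - \xi_1 f(\xi_1)$, $c_i = (I - ee')x_i = (0, x_{i2}, \ldots, x_{ip})'$
and

(3.3) $h(x) = x\,I(\xi_1 \le x \le \xi_2) + \xi_2\,(I(x > \xi_2) - \alpha) + \xi_1\,(I(x < \xi_1) - \alpha)$.

Then,

(3.4) $(1-2\alpha) N^{1/2}(\hat\beta_L - \beta) = N^{-1/2} \sum_{i=1}^{N} Q^{-1} c_i z_i I(\xi_1 \le z_i \le \xi_2)$
      $+ N^{-1/2} \sum_{i=1}^{N} e\,h(z_i) + a N^{1/2}(I - ee')(\hat\beta_0 - \beta) + o_p(1)$.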

For  our  next  theorem  we  require  another  condition. 

C4. For some function g,

$N^{1/2}(\hat\beta_0 - \beta) = N^{-1/2} \sum_{i=1}^{N} Q^{-1} x_i g(z_i) + o_p(1)$.

As is well-known, C4 holds with $g(x) = x$ if $\hat\beta_0$ is the LS estimate. By
Ruppert and Carroll (1978, Theorem 2), C4 holds with
$g(x) = (f(F^{-1}(1/2)))^{-1}(1/2 - I(x < F^{-1}(1/2)))$ if $\hat\beta_0$ is the LAD estimate. As a
consequence of Theorem 1, we have our main result.

Theorem 2. Assume C4. Then

(3.5) $(1-2\alpha) N^{1/2}(\hat\beta_L - \beta) = N^{-1/2} \sum_{i=1}^{N} Q^{-1} c_i \{ Z_i I(\xi_1 \le Z_i \le \xi_2) + a\,g(Z_i) \}$
      $+ N^{-1/2} \sum_{i=1}^{N} e\,h(Z_i) + o_p(1)$.

As a special case of Theorem 1, we obtain a result of deWet and Venter
(1974).


Corollary 1. In the location model ($p = 1$ and $x_i = 1$ for all i),

$(1-2\alpha) N^{1/2}(\hat\beta_L - \beta) = N^{-1/2} \sum_{i=1}^{N} h(Z_i) + o_p(1)$.

4. Asymptotics. In this section we show that Theorem 2 leads to the basic
conclusions:

1) The intercept estimate is asymptotically unbiased if F is symmetric.

2) The slope estimates are asymptotically unbiased even if F is asymmetric.

3) The asymptotic variance of the intercept, which does not depend upon the
choice of $\hat\beta_0$, is that of the trimmed mean in the location model.

4) The asymptotic covariance matrix of the slopes depends upon $\hat\beta_0$ and, in
general, will be difficult to estimate.

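Let $0$ be a $(p-1) \times 1$ vector of zeroes. By (2.2), there is a $\tilde Q$ such that

(4.1) $Q = \begin{pmatrix} 1 & 0' \\ 0 & \tilde Q \end{pmatrix}$ and $Q^{-1} = \begin{pmatrix} 1 & 0' \\ 0 & \tilde Q^{-1} \end{pmatrix}$.

Moreover,

(4.2) $\lim_{N\to\infty} N^{-1} \sum_{i=1}^{N} c_i c_i' = \begin{pmatrix} 0 & 0' \\ 0 & \tilde Q \end{pmatrix}$

and

(4.3) $Q^{-1} N^{-1} \sum_{i=1}^{N} c_i = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.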

We will call the first entry of $\beta$ the intercept and the remaining entries will
be called the slopes. If we estimate $\beta$ with $\hat\beta_L$, then the asymptotic bias of the
intercept is

$(1-2\alpha)^{-1} E\,h(Z_1) = (1-2\alpha)^{-1} \int_{\xi_1}^{\xi_2} x\,dF(x)$,

which is zero if F is symmetric about zero. By (3.4) and (4.3) the slope estimates
are asymptotically unbiased, even if F is asymmetric. The asymptotic variance of
the intercept, normalized by $N^{1/2}$, is

(4.4) $\sigma^2(\alpha,F) = (1-2\alpha)^{-2}\,\mathrm{Var}\,h(Z_1)$,

the asymptotic variance of the normalized $\alpha$-trimmed mean in the location model. The
intercept is asymptotically uncorrelated with the slopes, and the asymptotic
covariance matrix of the normalized slopes is $\tilde Q^{-1} \sigma^2(\alpha,g,F)$ where

(4.5) $\sigma^2(\alpha,g,F) = (1-2\alpha)^{-2}\,\mathrm{Var}\left( Z_1 I(\xi_1 \le Z_1 \le \xi_2) + a\,g(Z_1) \right)$.

We see that the asymptotic distribution of the intercept estimate does not depend
upon the choice of $\hat\beta_0$ provided $(\hat\beta_0 - \beta) = O_p(N^{-1/2})$.

On the other hand, we see from (3.4) that the slope estimates depend upon $\hat\beta_0$,
since the unusual situation where $a = 0$ is ruled out by assumption C2. Using the
Lindeberg central limit theorem and Theorem 2, it is easy to show that under C4,
$N^{1/2}(\hat\beta_L - \beta - e\,(1-2\alpha)^{-1} E\,h(Z_1))$ converges in distribution to a normal law.


In general, large sample statistical inference based on $\hat\beta_L$ will be a
challenging problem, because of the difficulties of estimating
$a = (\xi_2 f(\xi_2) - \xi_1 f(\xi_1))$. Obtaining reasonably good estimates of the density
f might take very large sample sizes.

5. A Close Analog to the Trimmed Mean. There is one choice of $\hat\beta_0$ (the average
of the $\alpha$th and $(1-\alpha)$th "regression quantiles") for which the asymptotic covariance
matrix of $\hat\beta_L$ is relatively simple to estimate when F is symmetric about 0. For
$0 < \theta < 1$, let $\hat\beta(\theta)$ be the $\theta$th regression quantile (Koenker and Bassett (1978)).
Let $\xi(\theta) = F^{-1}(\theta)$ and define $\psi_\theta(x) = \theta - I(x < 0)$. By theorem 2 of Ruppert and
Carroll (1978), if F has a continuous positive density f in a neighborhood of $\xi(\theta)$, then

(5.1) $N^{1/2}(\hat\beta(\theta) - \beta - \xi(\theta)e) = (f(\xi(\theta)))^{-1} N^{-1/2} Q^{-1} \sum_{i=1}^{N} x_i \psi_\theta(Z_i - \xi(\theta)) + o_p(1)$.

Let $\hat\beta_L(RQ)$ equal $\hat\beta_L$ when $\hat\beta_0 = (\hat\beta(\alpha) + \hat\beta(1-\alpha))/2$. By C2 and (5.1) this $\hat\beta_0$
satisfies (C4) with

$g(x) = (2 f(\xi_1))^{-1} \psi_\alpha(x - \xi_1) + (2 f(\xi_2))^{-1} \psi_{1-\alpha}(x - \xi_2)$.

If F is symmetric, then $\xi_1 = -\xi_2$, $f(\xi_1) = f(\xi_2)$, and therefore

(5.2) $a\,g(x) = \xi_1 I(x < \xi_1) + \xi_2 I(x > \xi_2)$.

By (3.5) and (5.2),

(5.3) $(1-2\alpha) N^{1/2}(\hat\beta_L(RQ) - \beta) = N^{-1/2} \sum_{i=1}^{N} Q^{-1} x_i h(Z_i) + o_p(1)$,

and therefore by (4.4), $N^{1/2}(\hat\beta_L(RQ) - \beta)$ is asymptotically normal with mean 0 and
covariance matrix $\sigma^2(\alpha,F)\,Q^{-1}$.

If we examine deWet and Venter's (1974) representation of the trimmed mean (cf.
corollary 1 of this paper), we see that (5.3) is a generalization of their result
to the general linear model. Therefore, this appears to be the "correct" choice.
Also by theorem 3 of Ruppert and Carroll (1978)

(5.4) $N^{1/2}(\hat\beta_L(RQ) - \hat\beta_{KB}) \to_P 0$,

so that asymptotically there is no difference between trimming with this preliminary
estimate and using Koenker and Bassett's (1978) proposal. (However, (5.4) does
not necessarily hold if F is asymmetric.)

Let $\hat\beta_L(LS)$ and $\hat\beta_L(LAD)$ be $\hat\beta_L$ when $\hat\beta_0$ is the LS and LAD estimate, respectively.
Table 1 displays $\sigma^2(\alpha,g,F)$ for several choices of $\alpha$, $\varepsilon$, and b (in the contaminated
normal model, $\varepsilon$ is the proportion of contamination and b is the standard deviation of
the contamination), and for g corresponding to $\hat\beta_L(LS)$, $\hat\beta_L(LAD)$, and $\hat\beta_L(RQ)$. For
comparison purposes, we include the asymptotic variance of the LS estimate, Huber's
proposal 2 M-estimate, and a one-step Hampel estimate using Huber's proposal 2 as a
preliminary estimate (Huber (1973), (1977)). (By asymptotic variance, we mean $\sigma^2$
where the asymptotic covariance is $\sigma^2 \tilde Q^{-1}$.) For discussion of the last two estimates
see Carroll and Ruppert (1979). Several conclusions emerge from Table 1.

1) $\hat\beta_L(LS)$ and $\hat\beta_L(LAD)$ are rather inefficient at the normal distribution.

2) $\hat\beta_L(RQ)$ is quite efficient at the normal model.

3) Under heavy contamination (b large or $\varepsilon$ large) $\hat\beta_L(LS)$, $\hat\beta_L(LAD)$, and $\hat\beta_L(RQ)$
are relatively efficient compared with LS. Also $\hat\beta_L(RQ)$ and $\hat\beta_L(LAD)$ compare
well against the M-estimates, but $\hat\beta_L(LS)$ does poorly compared to the
M-estimates if $\varepsilon = .25$, $b = 10$, and $\alpha = .25$. (Intuitively, one can expect
that when $\alpha = .25$, $\hat\beta_L(LS)$ will be heavily influenced by its preliminary
estimate, which estimates $\beta$ poorly for these b and $\varepsilon$.)

Because  of  1)  and  3),  the  practice  of  fitting  by  least  squares  or  LAD,  removing 

points  corresponding  to  extreme  residuals,  and  computing  the  least  squares  estimate 

from  the  trimmed  sample,  is  not  an  adequate  substitute  for  robust  methods  of  estimation. 
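The efficiency comparison can be made concrete by evaluating (4.5) numerically. The sketch below is an illustration under stated assumptions, not the computation used for Table 1: it assumes the contaminated normal $F = (1-\varepsilon)\,N(0,1) + \varepsilon\,N(0,b^2)$, which is symmetric, so that $a = 2\xi_2 f(\xi_2)$, and it treats only two preliminary estimates, LS with $g(x) = x$ and the regression quantile average with $a\,g$ given by (5.2). The function names are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal


def cdf(x, eps, b):
    """Distribution function of (1 - eps) N(0, 1) + eps N(0, b^2)."""
    return (1 - eps) * STD.cdf(x) + eps * STD.cdf(x / b)


def pdf(x, eps, b):
    """Density of the contaminated normal."""
    return (1 - eps) * STD.pdf(x) + eps * STD.pdf(x / b) / b


def upper_quantile(alpha, eps, b):
    """xi_2 = F^{-1}(1 - alpha), found by bisection."""
    lo, hi = 0.0, 50.0 * max(1.0, b)
    for _ in range(200):
        mid = (lo + hi) / 2
        if cdf(mid, eps, b) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def truncated_second_moment(c, s):
    """E[Z^2 I(|Z| <= c)] for Z ~ N(0, s^2)."""
    u = c / s
    return s * s * ((2 * STD.cdf(u) - 1) - 2 * u * STD.pdf(u))


def slope_variances(alpha, eps=0.0, b=1.0):
    """Return sigma^2(alpha, g, F) of (4.5) for the LS and RQ preliminary estimates."""
    xi2 = upper_quantile(alpha, eps, b)
    a = 2 * xi2 * pdf(xi2, eps, b)  # a = xi2 f(xi2) - xi1 f(xi1) for symmetric F
    m2 = ((1 - eps) * truncated_second_moment(xi2, 1.0)
          + eps * truncated_second_moment(xi2, b))
    var_z = (1 - eps) + eps * b * b
    scale = (1 - 2 * alpha) ** -2
    # LS preliminary, g(x) = x: Var(Z I + a Z) = m2 (1 + 2a) + a^2 Var(Z).
    ls = scale * (m2 * (1 + 2 * a) + a * a * var_z)
    # RQ preliminary: Z I + a g(Z) = h(Z), and Var h = m2 + 2 alpha xi2^2.
    rq = scale * (m2 + 2 * alpha * xi2 * xi2)
    return ls, rq


ls_normal, rq_normal = slope_variances(0.10)
```

With $\alpha = .10$ (a value chosen here for illustration) and the standard normal, this gives roughly 1.36 for the LS preliminary estimate and roughly 1.06 for the regression quantile average, which agrees with conclusions 1) and 2).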


If, instead of removing those observations with the $[N\alpha]$ smallest and $[N\alpha]$
largest residuals from $\hat\beta_0$, we remove those observations with the $[2N\alpha]$ largest
absolute residuals, then the asymptotic variance of the intercept is the same as
that of the slopes. Specifically, let $\hat\beta_A(\alpha)$ ($= \hat\beta_A$) be the estimate formed in this
manner. Then, if F is symmetric,

(6.1) $(1-2\alpha) N^{1/2}(\hat\beta_A - \beta) = N^{-1/2} \sum_{i=1}^{N} Q^{-1} x_i \{ Z_i I(\xi_1 \le Z_i \le \xi_2) \} + a N^{1/2}(\hat\beta_0 - \beta) + o_p(1)$

and if C4 holds, then

(6.2) $(1-2\alpha) N^{1/2}(\hat\beta_A - \beta) = N^{-1/2} \sum_{i=1}^{N} Q^{-1} x_i \{ Z_i I(\xi_1 \le Z_i \le \xi_2) + a\,g(Z_i) \} + o_p(1)$

which in the location case reduces to

(6.3) $(1-2\alpha) N^{1/2}(\hat\beta_A - \beta) = N^{-1/2} \sum_{i=1}^{N} \{ Z_i I(\xi_1 \le Z_i \le \xi_2) + a\,g(Z_i) \} + o_p(1)$.

The proofs are similar to those of theorems 1 and 2 and are omitted.

Since $\hat\beta_A$ is particularly easy to compute in the location model, it is very
suitable for Monte Carlo studies. It is hoped that such studies will indicate
the degree of agreement between the asymptotic and finite sample variances of
$\hat\beta_L$ as well as $\hat\beta_A$. Table 2 displays the variance of $N^{1/2}\hat\beta_A(LS)$, i.e. $\hat\beta_A$ with the
LS estimate, for sample sizes of N = 50, 100, 200, 300, and 400. The Monte-Carlo
swindle (Gross (1973)) was employed as a variance reduction technique. One
sees from this table that convergence of the variance to its asymptotic value can
be extremely slow for some distributions, e.g. $b = 10$ and $\varepsilon = .10$ or $.25$.
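A plain Monte Carlo experiment of this kind can be sketched as follows. This is only an illustration under stated assumptions: it does not use the Monte-Carlo swindle, the trimming proportion $\alpha = .10$ and the number of replications are arbitrary choices, the errors are drawn from the contaminated normal $(1-\varepsilon)\,N(0,1) + \varepsilon\,N(0,b^2)$, and the function names are hypothetical.

```python
import math
import random
from statistics import fmean, pvariance


def beta_a_location(sample, alpha):
    """LS preliminary estimate (the mean), drop the [2*N*alpha] largest
    absolute residuals, and average what is left."""
    n = len(sample)
    k = math.floor(2 * n * alpha)
    center = fmean(sample)
    kept = sorted(sample, key=lambda v: abs(v - center))[:n - k]
    return fmean(kept)


def draw(rng, eps, b):
    """One draw from (1 - eps) N(0, 1) + eps N(0, b^2)."""
    return rng.gauss(0.0, b if rng.random() < eps else 1.0)


def simulated_variance(n, alpha, eps=0.0, b=1.0, reps=2000, seed=0):
    """Monte Carlo estimate of the variance of N^{1/2} times the estimate."""
    rng = random.Random(seed)
    estimates = [beta_a_location([draw(rng, eps, b) for _ in range(n)], alpha)
                 for _ in range(reps)]
    return n * pvariance(estimates)


variance_n50 = simulated_variance(50, 0.10)
```

Repeating the call for increasing `n` and for contaminated errors (for example `eps=0.10, b=10.0`) gives a rough picture of how quickly the finite sample variance approaches its asymptotic value; without a variance reduction technique the estimates are noisier than those a swindle would give.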


7.  Conclusions.  Despite  their  intuitive  appeal,  trimmed  regression  estimates 
based  on  an  arbitrary  preliminary  estimate  will  not  be  very  satisfactory.  However, 
provided  the  error  distribution  is  symmetric,  there  is  one  such  estimate  that  is 
closely  analogous  to  the  trimmed  mean  in  the  location  model. 


Appendix

Proposition A.1. For $\theta = \alpha$ or $(1-\alpha)$ let $y_N$ be a sequence of solutions to

$\sum_{i=1}^{N} (r_i - y_N)\,\psi_\theta(r_i - y_N) = \min$.

Then,

$N^{-1/2} \sum_{i=1}^{N} \psi_\theta(r_i - y_N) = o_p(1)$.


Proof.  The  argument  is  very  similar  to  that  of  theorem  1  of  Ruppert  and 
Carroll  (1978)  and  will  be  omitted. 

Proof of lemma 1. As pointed out by Koenker and Bassett (1978), $y = r_{\theta N}$
is a solution to

$\sum_{i=1}^{N} (r_i - y)\,\psi_\theta(r_i - y) = \min$,

so that by Proposition A.1, for $\theta = \alpha$ or $(1-\alpha)$,

(A1) $N^{-1/2} \sum_{i=1}^{N} \psi_\theta\left( Z_i - \xi_\theta - x_i'\{ (\hat\beta_0 - \beta) + e\,(r_{\theta N} - \xi_\theta) \} \right) = o_p(1)$.

Here, we use the fact that $x_i'e = 1$. Define the processes

$V_N(\Delta) = N^{-1/2} \sum_{i=1}^{N} \psi_\theta(Z_i - \xi_\theta - x_i'\Delta/N^{1/2})$

and

$W_N(\Delta) = V_N(\Delta) - V_N(0) - E(V_N(\Delta) - V_N(0))$.

Following Bickel (1975) or as a special case of Lemma A2 of Ruppert and Carroll
(1978), for all $M > 0$,

(A2) $\sup_{0 \le \|\Delta\| \le M} |W_N(\Delta)| = o_p(1)$,

and

(A3) $\sup_{0 \le \|\Delta\| \le M} |V_N(\Delta) - V_N(0) + f(\xi_\theta)\,e'\Delta| = o_p(1)$.

Further, following the method of Jureckova (1977) or Lemma A.3 of Ruppert and
Carroll (1978), for all $\varepsilon > 0$ there exists $\eta$, K, and $N_0$ such that

(A4) $P\{ \inf_{\|\Delta\| \ge K} |V_N(\Delta)| < \eta \} < \varepsilon$ for $N \ge N_0$.

By (A1) and (A4) we have that

(A5) $N^{1/2}\{ (\hat\beta_0 - \beta) + e\,(r_{\theta N} - \xi_\theta) \} = O_p(1)$,

so by substituting the quantity in (A5) for $\Delta$ in (A3) we obtain by (A1) that

(A6) $N^{-1/2} \sum_{i=1}^{N} \psi_\theta(Z_i - \xi_\theta) - f(\xi_\theta)\,e' N^{1/2}\{ (\hat\beta_0 - \beta) + e\,(r_{\theta N} - \xi_\theta) \} = o_p(1)$. □
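The minimization characterization of a sample quantile used at the start of this proof can be checked numerically. The sketch below is an illustration only: the data are made up, the function names are hypothetical, and the search is restricted to the data points, where a minimizer of the check loss always exists. With N = 11 and $\theta = .25$ the minimizer is the order statistic of rank $\lceil N\theta \rceil$, which differs from the $[N\theta]$th ordered residual of the text by at most one position.

```python
import math


def check_loss(residuals, y, theta):
    """Sum of (r - y) * psi_theta(r - y), with psi_theta(u) = theta - I(u < 0)."""
    return sum((r - y) * (theta - (1.0 if r - y < 0 else 0.0)) for r in residuals)


def quantile_by_minimization(residuals, theta):
    """Return the data point that minimizes the check loss."""
    return min(residuals, key=lambda y: check_loss(residuals, y, theta))


# Made-up residuals for illustration.
residuals = [0.3, -1.2, 2.5, 0.9, -0.4, 1.7, -2.1, 0.1, 3.3, -0.8, 1.1]
theta = 0.25
minimizer = quantile_by_minimization(residuals, theta)
order_stat = sorted(residuals)[math.ceil(len(residuals) * theta) - 1]
```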

Proposition A.2. (Lemma A.4 of Ruppert and Carroll (1978)). Let
$D_i$ ($= D_{iN}$) be an $r \times c$ matrix whose $(\ell,k)$th component is denoted by $D_{i,\ell k}$. Suppose

$\lim_{N\to\infty} N^{-1} \sum_{i=1}^{N} D_{i,\ell k}^2$ exists for all $\ell$ and k.

Let h(x) be a function defined for all real x that is Lipschitz continuous on an
open interval containing $\xi_1$ and $\xi_2$. For $\Delta_1$, $\Delta_2$, and $\Delta_3$ in $R^p$ and $\Delta = (\Delta_1,\Delta_2,\Delta_3)$,
define

$T(\Delta) = N^{-1/2} \sum_{i=1}^{N} D_i\,h(Z_i + x_i'\Delta_3/N^{1/2})\,I\{ \xi_1 + x_i'\Delta_1/N^{1/2} \le Z_i \le \xi_2 + x_i'\Delta_2/N^{1/2} \}$.

Define

$S(\Delta) = T(\Delta) - T(0) - E(T(\Delta) - T(0))$.

Then, for all $M > 0$,

$\sup_{0 \le \|\Delta\| \le M} \|S(\Delta)\| = o_p(1)$.
Proof of theorem 1. For $\Delta_1$, $\Delta_2$ in $R^p$ and $\Delta = (\Delta_1,\Delta_2)$, define

$U(\Delta) = N^{-1} \sum_{i=1}^{N} x_i x_i'\,I\{ \xi_1 + x_i'\Delta_1/N^{1/2} \le Z_i \le \xi_2 + x_i'\Delta_2/N^{1/2} \}$

and

$W(\Delta) = N^{-1/2} \sum_{i=1}^{N} x_i Z_i\,I\{ \xi_1 + x_i'\Delta_1/N^{1/2} \le Z_i \le \xi_2 + x_i'\Delta_2/N^{1/2} \}$.

Using Proposition A.2, it is easy to show (cf. Ruppert and Carroll (1978), proof
of theorem 3) that for all $M > 0$,

(A7) $\sup_{0 \le \|\Delta\| \le M} |U(\Delta) - (1-2\alpha)Q| = o_p(1)$

and

(A8) $\sup_{0 \le \|\Delta\| \le M} |W(\Delta) - W(0) - Q(\Delta_2\,\xi_2 f(\xi_2) - \Delta_1\,\xi_1 f(\xi_1))| = o_p(1)$.

Then using the fact that $x_i'e = 1$, we have

$I(r_{1N} \le r_i \le r_{2N}) = I\{ \xi_1 + x_i'((\hat\beta_0 - \beta) + e\,(r_{1N} - \xi_1)) \le Z_i \le \xi_2 + x_i'((\hat\beta_0 - \beta) + e\,(r_{2N} - \xi_2)) \}$

and so replacing $\Delta_\ell$ by $N^{1/2}((\hat\beta_0 - \beta) + e\,(r_{\ell N} - \xi_\ell))$, for $\ell = 1, 2$, in (A7) and (A8), we
have

(A9) $N^{-1}(X'AX) = (1-2\alpha)Q + o_p(1)$

and

(A10) $N^{-1/2} X'A(y - X\beta) = W(0) + Q\{ \xi_2 f(\xi_2)\,N^{1/2}(\hat\beta_0 - \beta + e\,(r_{2N} - \xi_2))$
      $- \xi_1 f(\xi_1)\,N^{1/2}(\hat\beta_0 - \beta + e\,(r_{1N} - \xi_1)) \} + o_p(1)$.

By (A9) and (A10),

(A11) $N^{-1/2}(X'A(y - X\beta)) = (1-2\alpha)\,Q\,N^{1/2}(\hat\beta_L - \beta) + o_p(1)$.

By (A10), (A11), and (3.2)

(A12) $(1-2\alpha)\,Q\,N^{1/2}(\hat\beta_L - \beta) = W(0) + Q\{ \xi_2\,e\,N^{-1/2} \sum_{i=1}^{N} \psi_{1-\alpha}(Z_i - \xi_2)$
      $- \xi_1\,e\,N^{-1/2} \sum_{i=1}^{N} \psi_\alpha(Z_i - \xi_1) + N^{1/2}\,a\,(I - ee')(\hat\beta_0 - \beta) \} + o_p(1)$.

Then (3.4) follows from (A12), (3.1), and the definition of W(0).

Proof of theorem 2. By (2.2), the first row of Q is $e'$. Therefore, the
first row of $Q^{-1}$ is also $e'$. Consequently, $(I - ee')\,Q^{-1} x_i = Q^{-1} c_i$. Thus,
substituting C4 into (3.4) completes the proof.


Table 1 - Variances of the asymptotic distribution of slope estimators. (The asymptotic
covariance matrix is $\tilde Q^{-1}$ multiplied by the displayed quantity.)

*Proportion of contamination
**Standard deviation of contamination

Table 2 - Finite and Asymptotic Variances of $N^{1/2}\hat\beta_A(LS)$
in the Location Model

  ε+      b++    N*=50       N=100      N=200     N=300     N=400     Asymptotic
                 NI**=1000   NI=1000    NI=500    NI=500    NI=850
  NORMAL         1.31        1.36       1.37      1.32      1.35      1.36
  .05     3      1.47        1.49       1.50      1.47      1.48      1.51
  .05     5      1.57        1.65       1.70      1.66      1.65      1.71
  .05     10     2.10        2.36       2.54      2.51      2.40      2.66
  .10     3      1.58        1.58       1.65      1.63      1.60      1.64
  .10     5      1.74        1.83       1.97      1.90      1.90      1.96
  .10     10     2.24        2.51       2.92      2.99      3.03      3.32
  .25     3      2.01        1.93       1.94      1.96      1.96      1.97
  .25     5      2.12        2.05       2.08      2.11      2.07      2.09
  .25     10     2.98        2.42       2.14      2.13      2.11      1.88

+Proportion of contamination
++Standard deviation of contamination
*Sample size
**Number of Monte-Carlo simulations